Biogeometry Research Faster Multiple Sequence Alignment Algorithms Based on Pairwise Segmentation

Authors

  • Pankaj K. Agarwal
  • Yonatan Bilu
  • Rachel Kolodny
Abstract

Multiple Sequence Alignment (MSA) is a central problem in computational molecular biology: it identifies and quantifies similarities among several protein or DNA sequences. The well-known dynamic programming (DP) algorithms align k sequences (each of length n) by constructing a k-dimensional grid graph of size O(n^k), with each sequence enumerating one dimension of the grid. The optimal MSA is an optimal path from (n,...,n) to (0,...,0). Unfortunately, the exponential running time makes this approach prohibitive even for modest values of n and k. There is little hope of improving the worst-case efficiency of an algorithm for this problem, since the MSA problem is NP-hard [1]. However, the sequences used in the lower-bound constructions are not representative of the protein and DNA sequences abundant in nature, and the resulting alignments do not resemble those studied in practice. This led to the question of whether faster algorithms could be developed for protein and DNA sequences.
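
The grid-based DP described above can be illustrated with a minimal sketch. It is not the authors' faster algorithm, only the exponential baseline the abstract refers to. The sum-of-pairs objective with unit mismatch and gap costs, and the names `sp_column_cost` and `msa_cost`, are assumptions made for illustration. The grid has (n+1)^k vertices and each vertex has up to 2^k - 1 predecessors, which is why the approach is practical only for very small inputs.

```python
from functools import lru_cache
from itertools import product


def sp_column_cost(column, mismatch=1, gap=1):
    """Sum-of-pairs cost of one alignment column (None marks a gap).

    The unit costs are an illustrative assumption, not taken from the paper.
    """
    cost = 0
    for i in range(len(column)):
        for j in range(i + 1, len(column)):
            a, b = column[i], column[j]
            if a is None and b is None:
                continue  # a gap aligned with a gap costs nothing
            if a is None or b is None:
                cost += gap
            elif a != b:
                cost += mismatch
    return cost


def msa_cost(seqs):
    """Optimal sum-of-pairs alignment cost via DP over the k-dimensional grid."""
    k = len(seqs)
    # Each move consumes one character from a non-empty subset of sequences.
    moves = [m for m in product((0, 1), repeat=k) if any(m)]

    @lru_cache(maxsize=None)
    def best(pos):
        # pos is a grid vertex; the path runs from the full lengths to the origin.
        if not any(pos):
            return 0
        result = float("inf")
        for m in moves:
            prev = tuple(p - d for p, d in zip(pos, m))
            if min(prev) < 0:
                continue  # move would step outside the grid
            column = tuple(
                seqs[i][pos[i] - 1] if m[i] else None for i in range(k)
            )
            result = min(result, best(prev) + sp_column_cost(column))
        return result

    return best(tuple(len(s) for s in seqs))
```

For example, `msa_cost(["ACGT", "AGT"])` aligns two sequences that differ by a single deleted character, and `msa_cost(["A", "A", "G"])` scores a three-way column in which two of the three pairs mismatch. The recursion is memoized per grid vertex, so memory use also grows as (n+1)^k.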

Similar resources

gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...

An Application of the ABS LX Algorithm to Multiple Sequence Alignment

We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...

Statistical Alignment: Recent Progress, New Applications, and Challenges

Two papers by Thorne, Kishino, and Felsenstein in the early 90’s provided a basis for performing alignment within a statistical framework. Here we review progress and associated challenges in the investigation of models of insertions and deletions in biological sequences stemming from this early work. In the last few years this approach to sequence analysis has experienced a renaissance and rec...

Multiple Structural RNA Alignment with Affine Gap Costs Based on Lagrangian Relaxation

In this thesis the structural alignment of RNA sequences is addressed, a topic of crucial significance in the field of computational biology. Contrary to alignments of DNA, alignments of RNA are not only aligned based on sequence information, but largely depend on the correct structural alignment. Since the functions of RNA depend mostly on its secondary structure and this is highly conserved t...

Improving accuracy of multiple sequence alignment algorithms based on alignment of neighboring residues

While most of the recent improvements in multiple sequence alignment accuracy are due to better use of vertical information, which include the incorporation of consistency-based pairwise alignments and the use of profile alignments, we observe that it is possible to further improve accuracy by taking into account alignment of neighboring residues when aligning two residues, thus making better u...

Publication date: 2004